Improving automatic forced alignment for dysarthric speech transcription
Abstract
Dysarthria is a motor speech disorder caused by neurological deficits. Impaired movement of the muscles used for speech production leads to disordered speech, in which utterances show prolonged pauses, slow speaking rates, poor articulation of phonemes, syllable deletions, etc. These characteristics pose challenges for the use of speech technologies in the automatic processing of dysarthric speech data. As a first step toward addressing these challenges, this work tackles the performance degradation encountered in forced alignment. We perform initial alignments to locate long pauses in dysarthric speech and use the pause intervals as anchor points. We apply speech recognition to produce word lattices, from which we recover the time-stamps of words with disordered or incomplete pronunciations. By verifying the initial alignments against the word lattices, we obtain reliably aligned segments. These segments provide constraints for new alignment grammars that can improve alignment and transcription quality. We applied the proposed strategy to the TORGO corpus and obtained improved alignments for most dysarthric speech data, while maintaining good alignments for non-dysarthric speech data.
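The anchoring and verification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Word` record, the function names, the 0.5 s pause threshold, and the overlap criterion are all assumptions made for illustration, and the lattice is simplified to a flat list of timed word hypotheses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A timed word; hypothetical record used for both alignments and lattice arcs."""
    label: str
    start: float  # seconds
    end: float    # seconds


def find_pause_anchors(aligned, min_pause=0.5):
    """Return (start, end) gaps between consecutive aligned words
    that last at least min_pause seconds (assumed threshold)."""
    return [
        (a.end, b.start)
        for a, b in zip(aligned, aligned[1:])
        if b.start - a.end >= min_pause
    ]


def overlap_ratio(a, b):
    """Temporal intersection divided by the total span covered by both intervals."""
    inter = min(a.end, b.end) - max(a.start, b.start)
    span = max(a.end, b.end) - min(a.start, b.start)
    return max(inter, 0.0) / span if span > 0 else 0.0


def verify_with_lattice(aligned, lattice, min_overlap=0.5):
    """Mark an aligned word as verified when some lattice hypothesis has
    the same label and overlaps it sufficiently in time (assumed criterion)."""
    return [
        any(h.label == w.label and overlap_ratio(w, h) >= min_overlap for h in lattice)
        for w in aligned
    ]


def reliable_segments(aligned, verified):
    """Group runs of consecutive verified words into segments."""
    segments, current = [], []
    for word, ok in zip(aligned, verified):
        if ok:
            current.append(word)
        elif current:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments
```

In this sketch, the pause anchors and reliable segments would then serve as fixed points when re-aligning the remaining, unverified stretches. How those constraints are encoded in an alignment grammar is specific to the paper and is not modeled here.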
Similar resources
An HMM-based phoneme recognizer applied to assessment of dysarthric speech
This paper describes work on the development of an HMM-based system for automatic speech assessment, particularly of dysarthric speech. As a first step, we compare recognizer performance on a closed-set, forced choice identification test of dysarthric speech with performance on the same test by untrained listeners. Results indicate that HMM recognition accuracy averaged over all utterances of a...
An Automatic Dysarthric Speech Recognition Approach using Deep Neural Networks
Transcribing dysarthric speech into text is still a challenging problem for state-of-the-art techniques and commercially available speech recognition systems. To improve the accuracy of dysarthric speech recognition, this paper adopts Deep Belief Neural Networks (DBNs) to model the distribution of dysarthric speech signal. A continuous dysarthric speech recognition system is produced, in whic...
Evaluation of a Phone-Based Anomaly Detection Approach for Dysarthric Speech
Perceptual evaluation is still the most common method in clinical practice for diagnosing people with speech disorders and following the progression of their condition. Many automatic approaches have been proposed to provide objective tools for dealing with speech disorders and to help professionals evaluate the severity of speech impairments. This paper investigates an automatic phone-based anom...
Automatic Phonetic Transcription in Two Steps: Forced Alignment and Burst Detection
In the last decade, there has been growing interest in conversational speech in the fields of human and automatic speech recognition. Whereas for the varieties spoken in Germany both resources and tools are numerous, for Austrian German the first corpus of read and conversational speech was collected only recently. In the current paper, we present automatic methods to phonetically transcribe and ...
Multiple-Pronunciation Lexical Modeling Based on Phoneme Confusion Matrix for Dysarthric Speech Recognition
In this paper, we propose speaker-dependent multiple-pronunciation lexical modeling for improving the performance of dysarthric automatic speech recognition (ASR). For each dysarthric speaker, a phoneme confusion matrix is first constructed from the results of phoneme recognition. Then, pronunciation variation rules are extracted by investigating the phoneme confusion matrix, and they are incor...